All liaisons are dangerous when all your friends are known to us
Online Social Networks (OSNs) are used by millions of users worldwide.
Academically speaking, there is little doubt about the usefulness of
demographic studies conducted on OSNs and, hence, methods to label unknown
users from small labeled samples are very useful. However, from the general
public's point of view, this can be a serious privacy concern. Thus, both topics
are tackled in this paper: first, a new algorithm to perform user profiling in
social networks is described, and its performance is reported and discussed.
Second, the experiments (conducted on information usually considered
sensitive) reveal that merely publicizing one's contacts puts privacy at risk
and, thus, measures to minimize privacy leaks due to social graph data mining
are outlined.
Comment: 10 pages, 5 tables
Nepotistic relationships in Twitter and their impact on rank prestige algorithms
Micro-blogging services such as Twitter allow anyone to publish anything, anytime. Needless to say, many of the available contents can be dismissed as babble or spam. However, given the number and diversity of users, some valuable pieces of information should arise from the stream of tweets. Thus, such services can develop into valuable sources of up-to-date information (the so-called real-time web) provided a way to find the most relevant/trustworthy/authoritative users is available. Finding such users is therefore a highly pertinent question, one for which graph centrality methods can provide an answer. In this paper the author offers a comprehensive survey of feasible algorithms for ranking users in social networks, examines their vulnerabilities to linking malpractice in such networks, and suggests an objective criterion against which to compare such algorithms. Additionally, he suggests a first step towards "desensitizing" prestige algorithms against cheating by spammers and other abusive users.
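The rank prestige algorithms surveyed in that paper build on graph centrality measures in the PageRank family. As a minimal illustrative sketch (not the author's implementation), prestige on a toy "who-follows-whom" graph can be computed by power iteration:

```python
# Minimal PageRank-style prestige sketch on a toy follower graph.
# An edge (u, v) means user u follows user v, so v receives prestige from u.
# Illustrative only; the surveyed algorithms differ in their details.

def pagerank(edges, damping=0.85, iterations=50):
    nodes = {n for edge in edges for n in edge}
    out_degree = {n: 0 for n in nodes}
    for u, _ in edges:
        out_degree[u] += 1
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for u, v in edges:
            new_rank[v] += damping * rank[u] / out_degree[u]
        # Redistribute rank held by dangling nodes (no outgoing edges).
        dangling = sum(rank[n] for n in nodes if out_degree[n] == 0)
        for n in nodes:
            new_rank[n] += damping * dangling / len(nodes)
        rank = new_rank
    return rank

# "celebrity" is followed by everyone, so it should rank highest.
follows = [("alice", "celebrity"), ("bob", "celebrity"),
           ("carol", "celebrity"), ("alice", "bob")]
scores = pagerank(follows)
```

The paper's point about linking malpractice is visible even in this sketch: adding spam accounts whose only purpose is to follow a target inflates that target's score, which is why robustness to such manipulation matters.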
Leveraging Wikidata's edit history in knowledge graph refinement tasks
Knowledge graphs have been adopted in many diverse fields for a variety of
purposes. Most of those applications rely on valid and complete data to deliver
their results, pressing the need to improve the quality of knowledge graphs. A
number of solutions have been proposed to that end, ranging from rule-based
approaches to the use of probabilistic methods, but there is an element that
has not been considered yet: the edit history of the graph. In the case of
collaborative knowledge graphs (e.g., Wikidata), those edits represent the
process in which the community reaches some kind of fuzzy and distributed
consensus over the information that best represents each entity, and can hold
potentially interesting information to be used by knowledge graph refinement
methods. In this paper, we explore the use of edit history information from
Wikidata to improve the performance of type prediction methods. To do that, we
have first built a JSON dataset containing the edit history of every instance
from the 100 most important classes in Wikidata. This edit history information
is then explored and analyzed, with a focus on its potential applicability in
knowledge graph refinement tasks. Finally, we propose and evaluate two new
methods to leverage this edit history information in knowledge graph embedding
models for type prediction tasks. Our results show that one of the proposed
methods improves on current approaches, demonstrating the potential of using
edit information in knowledge graph refinement tasks and opening new promising
research lines within the field.
Comment: 18 pages, 7 figures. Submitted to the Journal of Web Semantics
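To make the idea of exploiting edit history concrete, here is a small sketch of turning a per-entity revision record into aggregate features that a refinement model could consume. The JSON layout and field names below are hypothetical illustrations, not the schema of the dataset described in the abstract:

```python
import json

# Hypothetical edit-history record for one Wikidata entity (Q42).
# The field names ("revisions", "claims_added", ...) are assumptions for
# illustration, not the actual dataset schema.
record = json.loads("""
{
  "entity": "Q42",
  "revisions": [
    {"user": "A", "timestamp": "2019-01-01", "claims_added": ["P31"]},
    {"user": "B", "timestamp": "2019-03-15", "claims_added": ["P106", "P569"]},
    {"user": "A", "timestamp": "2020-07-09", "claims_added": []}
  ]
}
""")

def edit_features(rec):
    """Aggregate a revision list into simple per-entity features."""
    revisions = rec["revisions"]
    editors = {r["user"] for r in revisions}
    properties_touched = {p for r in revisions for p in r["claims_added"]}
    return {
        "n_edits": len(revisions),
        "n_editors": len(editors),
        "n_properties": len(properties_touched),
    }

features = edit_features(record)
```

Features like these capture how much community attention an entity has received, which is one plausible signal for deciding how much to trust its current type assertions.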
Enseñar informática es como...
A common problem when explaining computing concepts is their high level of abstraction, as well as the difficulty of providing examples that are both adequate and close to the students' personal experience. One possible solution to these problems lies in the use of metaphors as advance organizers that allow students to grasp the essence of the concepts they emulate, so that they can extrapolate this new knowledge to the computing field and learn the explained concepts more easily. This article presents some aspects of current theories on metaphor, as well as their implications for teaching. Finally, some very simple techniques for developing didactic metaphors are shown, along with some metaphors built by applying them.
Survey and evaluation of query intent detection methods
Second ACM International Conference on Web Search and Data Mining, Barcelona (Spain)
User interactions with search engines reveal three main underlying intents, namely navigational, informational, and transactional. By providing more accurate results depending on such query intents, the performance of search engines can be greatly improved. Therefore, query classification has been an active research topic in recent years. However, while query topic classification has been the subject of a dedicated bakeoff, no evaluation campaign has been devoted to the study of automatic query intent detection. In this paper some of the available query intent detection techniques are reviewed, an evaluation framework is proposed, and it is used to compare those methods in order to shed light on their relative performance and drawbacks. As will be shown, manually prepared gold-standard files are much needed, and traditional pooling is not the most feasible evaluation method. In addition, future lines of work in both query intent detection and its evaluation are proposed.
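The navigational/informational/transactional taxonomy can be illustrated with a minimal rule-based classifier of the kind some surveyed methods build on. The cue lists below are illustrative assumptions, not the actual features used by any of the evaluated techniques:

```python
# Toy sketch of the navigational / informational / transactional taxonomy.
# The cue lists are illustrative assumptions, not the surveyed methods.

TRANSACTIONAL_CUES = {"buy", "download", "order", "purchase", "free"}
NAVIGATIONAL_CUES = {"www", ".com", ".org", "homepage", "login"}

def classify_intent(query):
    lowered = query.lower()
    # Navigational: the user wants to reach a specific site.
    if any(cue in lowered for cue in NAVIGATIONAL_CUES):
        return "navigational"
    # Transactional: the user wants to perform an action (buy, download, ...).
    if any(tok in TRANSACTIONAL_CUES for tok in lowered.split()):
        return "transactional"
    # Informational: the default, the user wants information on a topic.
    return "informational"
```

Even this sketch hints at the evaluation problem the paper raises: without a manually prepared gold standard, there is no principled way to measure how often such heuristics mislabel ambiguous queries.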